Improving Procedural Skill Explanations via Constrained Generation: A Symbolic-LLM Hybrid Architecture

Dass, Rahul, Bowlin, Thomas, Li, Zebing, Jin, Xiao, Goel, Ashok

arXiv.org Artificial Intelligence

In procedural skill learning, instructional explanations must convey not just steps, but the causal, goal-directed, and compositional logic behind them. Large language models (LLMs) often produce fluent yet shallow responses that miss this structure. We present Ivy, an AI coaching system that delivers structured, multi-step explanations by combining symbolic Task-Method-Knowledge (TMK) models with a generative interpretation layer: an LLM that constructs explanations while being constrained by TMK structure. TMK encodes causal transitions, goal hierarchies, and problem decompositions, and guides the LLM within explicit structural bounds. We evaluate Ivy's responses against GPT and retrieval-augmented GPT baselines using expert and independent annotations across three inferential dimensions. Results show that symbolic constraints consistently improve the structural quality of explanations for "how" and "why" questions. This study demonstrates a scalable AI-for-education approach that strengthens the pedagogical value of AI-generated explanations in intelligent coaching systems.


Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer

Khan, Muhammad Tayyab, Yong, Zane, Chen, Lequn, Tan, Jun Ming, Feng, Wenhe, Moon, Seung Ki

arXiv.org Artificial Intelligence

Accurate extraction of key information from 2D engineering drawings is crucial for high-precision manufacturing. Manual extraction is slow and labor-intensive, while traditional Optical Character Recognition (OCR) techniques often struggle with complex layouts and overlapping symbols, resulting in unstructured outputs. To address these challenges, this paper proposes a novel hybrid deep learning framework for structured information extraction by integrating an Oriented Bounding Box (OBB) detection model with a transformer-based document parsing model (Donut). An in-house annotated dataset is used to train YOLOv11 to detect nine key categories: Geometric Dimensioning and Tolerancing (GD&T), General Tolerances, Measures, Materials, Notes, Radii, Surface Roughness, Threads, and Title Blocks. Detected OBBs are cropped into images and labeled to fine-tune Donut for structured JSON output. Fine-tuning strategies include a single model trained across all categories and category-specific models. Results show that the single model consistently outperforms the category-specific ones across all evaluation metrics, achieving higher precision (94.77% for GD&T), recall (100% for most categories), and F1 score (97.3%), while reducing hallucinations (5.23%). The proposed framework improves accuracy, reduces manual effort, and supports scalable deployment in precision-driven industries.
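The detect-crop-parse pipeline described above can be sketched in a few lines. This is a minimal illustration only: the helper names are hypothetical, the crops use axis-aligned slices rather than true oriented boxes, and `parse_with_donut` is a stand-in for the fine-tuned Donut model.

```python
import numpy as np

# The nine categories the abstract says YOLOv11 is trained to detect.
CATEGORIES = ["GD&T", "General Tolerances", "Measures", "Materials", "Notes",
              "Radii", "Surface Roughness", "Threads", "Title Blocks"]

def crop_regions(image, detections):
    """Crop each detected region (axis-aligned simplification of the OBBs)."""
    return [(category, image[y0:y1, x0:x1])
            for category, (x0, y0, x1, y1) in detections]

def parse_with_donut(category, crop):
    """Placeholder: a real system would run the fine-tuned Donut model here
    and return its structured JSON output for the cropped region."""
    return {"category": category, "text": f"<parsed {crop.shape}>"}

def extract_structured_info(image, detections):
    """Detection output -> cropped regions -> structured records."""
    return [parse_with_donut(c, crop) for c, crop in crop_regions(image, detections)]

# Tiny synthetic "drawing" and two mock detections (x0, y0, x1, y1).
drawing = np.zeros((200, 300), dtype=np.uint8)
dets = [("GD&T", (10, 10, 60, 40)), ("Title Blocks", (200, 150, 290, 190))]
records = extract_structured_info(drawing, dets)
```

In the actual framework the single Donut model fine-tuned across all categories would replace the placeholder parser.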


MAGIC: Near-Optimal Data Attribution for Deep Learning

Ilyas, Andrew, Engstrom, Logan

arXiv.org Machine Learning

A fundamental problem when building machine learning systems is to predict counterfactuals about model behavior. For example, scaling laws [KMH+20; Has21; MRB+23] aim to predict the performance of systems trained with more data and more compute than is currently available; interpretability techniques [KWG+18] predict how models behave under counterfactual inputs. Analogously, in this work we study predictive data attribution (or datamodeling [IPE+22]), where the goal is to predict how a model would behave if it had been trained on a different dataset. This well-studied problem encompasses, e.g., estimating the effect (on the resulting trained model's predictions) of modifying a training example [KL17], removing a group of training examples [KAT+19; BNL+22; PGI+23], or adding entire training data sources [LSZ+24]. Predictive data attribution in large-scale settings is challenging: it requires simulating training a model on a different dataset without actually training [GWP+23; IGE+24]. In "classical" settings--when learning corresponds to minimizing a convex loss--statistical tools like the influence function [Ham47] allow us to accurately and efficiently estimate how different training data choices change trained model predictions [RM18; KAT+19; GSL+19]. However, in the non-convex settings that are ubiquitous in natural domains like language/vision, current methods are less effective. Indeed, the best existing methods produce estimates that typically (a) only moderately correlate with the ground truth [BPF21; BNL+22; PGI+23] and (b) incur large absolute error [BNL+22].
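The "classical" convex setting mentioned above can be made concrete with ridge regression, where both the influence-function estimate and the exact leave-one-out solution have closed forms. This is a self-contained sketch on synthetic data, not the paper's method; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 50, 3, 1e-2
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

H = X.T @ X + lam * np.eye(d)         # Hessian of the ridge objective
theta = np.linalg.solve(H, X.T @ y)   # full-data minimizer

i = 0                                  # training example to "remove"
g_i = X[i] * (X[i] @ theta - y[i])     # gradient of example i's loss at theta

# First-order influence-function estimate of the leave-one-out parameters.
theta_if = theta + np.linalg.solve(H, g_i)

# Exact leave-one-out solution: retrain (in closed form) without example i.
H_loo = H - np.outer(X[i], X[i])
theta_loo = np.linalg.solve(H_loo, X.T @ y - X[i] * y[i])

err_if = np.linalg.norm(theta_if - theta_loo)     # influence-estimate error
err_none = np.linalg.norm(theta - theta_loo)      # baseline: ignore removal
```

For convex losses the first-order estimate tracks the retrained parameters closely; as the abstract notes, it is precisely in non-convex settings that such estimates degrade.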


FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting

Li, Zhe, Qiu, Xiangfei, Chen, Peng, Wang, Yihang, Cheng, Hanyin, Shu, Yang, Hu, Jilin, Guo, Chenjuan, Zhou, Aoying, Wen, Qingsong, Jensen, Christian S., Yang, Bin

arXiv.org Artificial Intelligence

Time Series Forecasting (TSF) is a key functionality in numerous fields, including finance, weather services, and energy management. While many TSF methods have emerged recently, they often require domain-specific data collection and model training and generalize poorly to new domains. Foundation models aim to overcome this limitation. Pre-trained on large-scale language or time series data, they exhibit promising inference capabilities on new or unseen data. This has spurred a surge in new TSF foundation models. We propose a new benchmark, FoundTS, to enable thorough and fair evaluation and comparison of such models. FoundTS covers a variety of TSF foundation models, including those based on large language models and those pretrained on time series. Next, FoundTS supports different forecasting strategies, including zero-shot, few-shot, and full-shot, thereby facilitating more thorough evaluations. Finally, FoundTS offers a pipeline that standardizes evaluation processes such as dataset splitting, loading, normalization, and few-shot sampling, thereby facilitating fair evaluations. Building on this, we report on an extensive evaluation of TSF foundation models on a broad range of datasets from diverse domains and with different statistical characteristics. Specifically, we identify the pros, cons, and inherent limitations of existing foundation models, and we identify directions for future model design. We make our code and datasets available at https://anonymous.4open.science/r/FoundTS-C2B0.
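The standardized evaluation steps the abstract lists (splitting, normalization, few-shot sampling) might look like the following. This is a hypothetical sketch of what such a pipeline could do, not FoundTS's actual code; the function name and defaults are assumptions.

```python
import numpy as np

def make_eval_splits(series, horizon, few_shot_frac=0.05, seed=0):
    """Split a univariate series, normalize with train statistics only,
    and draw a reproducible few-shot sample from the training segment."""
    train, test = series[:-horizon], series[-horizon:]
    mean, std = train.mean(), train.std() + 1e-8   # avoid divide-by-zero
    norm = lambda x: (x - mean) / std              # test uses TRAIN stats
    rng = np.random.default_rng(seed)
    k = max(1, int(len(train) * few_shot_frac))
    idx = np.sort(rng.choice(len(train), size=k, replace=False))
    return {"train": norm(train), "test": norm(test), "few_shot": norm(train[idx])}

splits = make_eval_splits(np.arange(100.0), horizon=10)
```

Fixing the split point, normalization statistics, and few-shot seed across all models is what makes comparisons between foundation models fair.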


Aligning Models with Their Realization through Model-based Systems Engineering

Zenz, Lovis Justin Immanuel, Heiland, Erik, Hillmann, Peter, Karcher, Andreas

arXiv.org Artificial Intelligence

In this paper, we propose a method for aligning models with their realization through the application of model-based systems engineering. Our approach is divided into three steps. (1) Firstly, we leverage domain expertise and the Unified Architecture Framework to establish a reference model that fundamentally describes a domain. (2) Subsequently, we instantiate the reference model as specific models tailored to different scenarios within the domain. (3) Finally, we incorporate corresponding run logic directly into both the reference model and the specific models. In total, we thus provide a practical means to ensure that every implementation result is justified by business demand. We demonstrate our approach using the example of maritime object detection as a specific application (specific model / implementation element) of automatic target recognition as a service recurring in various forms (reference model element). Our approach facilitates a more seamless integration of models and implementation, fostering enhanced Business-IT alignment.


Ghostbuster: detecting text ghostwritten by large language models

AIHub

Large language models like ChatGPT write impressively well--so well, in fact, that they've become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. In addition, these models are also prone to producing text with factual errors, so wary readers may want to know if generative AI tools have been used to ghostwrite news articles or other sources before trusting them. What can teachers and consumers do? Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on.


Safer Together: Machine Learning Models Trained on Shared Accident Datasets Predict Construction Injuries Better than Company-Specific Models

Tixier, Antoine J. -P., Hallowell, Matthew R.

arXiv.org Artificial Intelligence

In this study, we capitalized on a collective dataset repository of 57k accidents from 9 companies belonging to 3 domains and tested whether models trained on multiple datasets (generic models) predicted safety outcomes better than company-specific models. We experimented with full generic models (trained on all data), per-domain generic models (construction, electric T&D, oil & gas), and with ensembles of generic and specific models. Results are very positive, with generic models outperforming the company-specific models in most cases while also generating finer-grained, hence more useful, forecasts. Successful generic models remove the need to train company-specific models, saving considerable time and resources, and give small companies, whose accident datasets are too limited to train their own models, access to safety outcome predictions. It may still, however, be advantageous to train specific models to get an extra boost in performance through ensembling with the generic models. Overall, by learning lessons from a pool of datasets whose accumulated experience far exceeds that of any single company, and making these lessons easily accessible in the form of simple forecasts, generic models tackle the holy grail of cross-organizational safety learning and dissemination in the construction industry.
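The ensembling step mentioned above can be as simple as a weighted average of the two models' predicted outcome probabilities. The sketch below is a generic illustration of that idea, with a hypothetical function name and weights; in practice the weight would be tuned on validation data.

```python
def ensemble_forecast(p_generic, p_specific, w_generic=0.5):
    """Combine a generic (multi-company) model's predicted probabilities
    with a company-specific model's via a weighted average."""
    return [w_generic * pg + (1.0 - w_generic) * ps
            for pg, ps in zip(p_generic, p_specific)]

# Example: three accident-outcome probabilities from each model.
combined = ensemble_forecast([0.8, 0.1, 0.4], [0.6, 0.3, 0.4], w_generic=0.7)
```

Leaning on the generic model (here with weight 0.7) reflects the study's finding that the pooled data carries most of the signal, while the specific model contributes a company-level correction.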


A Journey into the Fabulous Applications of Transformers -- Part 2 – Towards AI

#artificialintelligence

Originally published on Towards AI, the world's leading AI and technology news and media company. Transformer architecture is widely used in Natural Language Processing and has contributed substantially to the rise of Large Language Models (LLMs).


Federated Learning based Energy Demand Prediction with Clustered Aggregation

Tun, Ye Lin, Thar, Kyi, Thwal, Chu Myaet, Hong, Choong Seon

arXiv.org Artificial Intelligence

To reduce negative environmental impacts, power stations and energy grids need to optimize the resources required for power production. Thus, predicting the energy consumption of clients is becoming an important part of every energy management system. Energy usage information collected by the clients' smart homes can be used to train a deep neural network to predict future energy demand. Collecting data from a large number of distributed clients for centralized model training is expensive in terms of communication resources. To take advantage of distributed data in edge systems, centralized training can be replaced by federated learning, where each client only needs to upload model updates produced by training on its local data. These model updates are aggregated into a single global model by the server. However, since different clients can have different attributes, their model updates can have diverse weights, and as a result, the aggregated global model can take a long time to converge. To speed up convergence, we can apply clustering to group clients based on their properties and aggregate model updates from the same cluster together to produce a cluster-specific global model. In this paper, we propose a recurrent neural network based energy demand predictor, trained with federated learning on clustered clients to take advantage of distributed data and speed up the convergence process.
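The clustered aggregation step described above can be sketched as weighted federated averaging performed per cluster. This is a minimal numpy illustration with flat weight vectors and hypothetical client IDs; a real system would aggregate RNN weight tensors over many communication rounds.

```python
import numpy as np

def clustered_fedavg(client_weights, client_sizes, client_cluster):
    """Aggregate client model updates per cluster (FedAvg weighted by
    each client's local sample count), returning one model per cluster."""
    clusters = {}
    for cid, w in client_weights.items():
        clusters.setdefault(client_cluster[cid], []).append(
            (np.asarray(w, dtype=float), client_sizes[cid]))
    global_models = {}
    for cluster_id, members in clusters.items():
        total = sum(n for _, n in members)
        global_models[cluster_id] = sum(w * (n / total) for w, n in members)
    return global_models

# Three clients: "a" and "b" share cluster 0, "c" is alone in cluster 1.
models = clustered_fedavg(
    client_weights={"a": [1.0, 1.0], "b": [3.0, 3.0], "c": [10.0, 0.0]},
    client_sizes={"a": 1, "b": 3, "c": 2},
    client_cluster={"a": 0, "b": 0, "c": 1},
)
```

Grouping similar clients before averaging keeps each cluster model close to its members' local optima, which is the convergence speed-up the paper targets.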


Identification of 12 cancer types through genome deep learning - Scientific Reports

#artificialintelligence

Cancer is a major cause of death worldwide, and an early diagnosis is required for a favorable prognosis. Histological examination is the gold standard for cancer identification; however, a large amount of inter-observer variability exists in histological diagnosis. Numerous studies have shown that cancer genesis is accompanied by an accumulation of harmful mutations, potentiating the identification of cancer based on genomic information. We have proposed a method, GDL (genome deep learning), to study the relationship between genomic variations and traits based on deep neural networks. We analyzed 6,083 samples' WES (Whole Exome Sequencing) mutation files from 12 cancer types obtained from the TCGA (The Cancer Genome Atlas) and 1,991 healthy samples' WES data from the 1000 Genomes project. We constructed 12 specific models to distinguish between a certain type of cancer and healthy tissue, a total-specific model that can identify healthy and cancer tissues, and a mixture model to distinguish between all 12 types of cancer based on GDL. We demonstrate that the accuracies of the specific, mixture, and total-specific models are 97.47%, 70.08%, and 94.70%, respectively, for cancer identification. We developed an efficient method for the identification of cancer based on genomic information that offers a new direction for disease diagnosis.